Conversation
There was a problem hiding this comment.
1 issue found across 9 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="kernel/debug_trace.c">
<violation number="1" location="kernel/debug_trace.c:65">
P2: `debug_trace_event()` updates shared ring-buffer state without the trace lock, enabling races/lost accounting in re-entrant or preempted execution.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| return (uint8_t)task_id; | ||
| } | ||
|
|
||
| void debug_trace_event(uint8_t event_type, uint32_t param1, uint32_t param2) |
There was a problem hiding this comment.
P2: debug_trace_event() updates shared ring-buffer state without the trace lock, enabling races/lost accounting in re-entrant or preempted execution.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At kernel/debug_trace.c, line 65:
<comment>`debug_trace_event()` updates shared ring-buffer state without the trace lock, enabling races/lost accounting in re-entrant or preempted execution.</comment>
<file context>
@@ -0,0 +1,166 @@
+ return (uint8_t)task_id;
+}
+
+void debug_trace_event(uint8_t event_type, uint32_t param1, uint32_t param2)
+{
+ debug_event_t *event = &event_buffer[event_write_index];
</file context>
ae91cc3 to
8b71090
Compare
visitorckw
left a comment
There was a problem hiding this comment.
Please follow the project's commit message format. Since many of your commits contain significant changes, please add detailed descriptions in the commit bodies.
I don't think this block is necessary. |
| if (task_id > 0xFFU) | ||
| return 0xFFU; | ||
|
|
||
| return (uint8_t) task_id; |
There was a problem hiding this comment.
Why do we need to truncate a 16 bit id to 8 bit id?
There was a problem hiding this comment.
We do not need that truncation. This came from following the initial draft too literally.
I will remove the 8-bit narrowing and update debug_event_t accordingly so the trace format stays aligned with the actual task-id width.
There was a problem hiding this comment.
The 8-bit truncation was removed, and debug_event_t now keeps task_id aligned with the kernel's actual task-id width.
| uint32_t param2; | ||
| } debug_event_t; | ||
|
|
||
| typedef char debug_event_t_size_check[(sizeof(debug_event_t) == 16) ? 1 : -1]; |
There was a problem hiding this comment.
Why would it be a serious error if sizeof(debug_event_t) != 16?
What if we want to add more fields here later?
There was a problem hiding this comment.
The hard 16-byte enforcement is too rigid for the current prototype. I will remove that fixed-size assertion and let debug_event_t be defined by the actual event fields we need, instead of treating 16 bytes as an invariant.
There was a problem hiding this comment.
The hard fixed-size assertion was removed, and the record layout is no longer treated as a strict invariant.
|
|
||
| /* Add to cache and mark ready */ | ||
| /* Publish the task to the scheduler and trace it as one atomic step. */ | ||
| CRITICAL_ENTER(); |
There was a problem hiding this comment.
This is a no-go for me.
It degrades the scheduler's performance just for a new debug feature.
Perhaps we should consider including this code only in debug builds?
There was a problem hiding this comment.
The current shape does add unconditional scheduler-path work for a debug-only feature, which is not acceptable.
I will rework this so the tracing path is only included in debug-enabled builds, in line with the zero-overhead requirement from #25.
There was a problem hiding this comment.
The scheduler trace call sites were revised so tracing-disabled builds no longer pay the trace-only scheduler overhead.
| } while (0) | ||
| #define debug_clear_events() \ | ||
| do { \ | ||
| } while (0) |
There was a problem hiding this comment.
IMHO, It would be better to use empty static inline functions here instead of macros.
It ensures type safety across all builds and prevents potential unexpected side effects.
There was a problem hiding this comment.
Empty static inline functions are the better disabled-build path here.
I will switch the no-op interface to that form.
There was a problem hiding this comment.
The disabled-build tracing interface now uses empty static inline helpers instead of macros.
|
I saw that you re-requested my review, but I don't see any updates to the code or commits? |
|
I re-requested review too early and had not pushed the follow-up changes yet. Sorry for the wasted review cycle. |
Linmo currently lacks an event tracing mechanism for scheduler activity, which makes it harder to inspect task lifecycle changes and context switch behavior during debugging. Introduce a tracing core backed by a fixed-size ring buffer and wire it into the scheduler event paths required by the initial tracing scope. This adds trace records for task creation, cancellation, switching, suspension, resumption, delay, and explicit yield. The tracing interface is defined under `include/sys/debug_trace.h`, with the recording logic implemented in `kernel/debug_trace.c`. Scheduler integration is added in `kernel/task.c`, and the build system is updated to include the tracing core only when enabled by configuration. The trace record layout is kept compact while aligning `task_id` with the kernel's actual task identifier width. The disabled build path uses empty static inline helpers so trace calls type-check cleanly while compiling out the tracing implementation.
Add `trace_sched`, a deterministic validation application for the initial scheduler tracing implementation. The application drives a controlled sequence of task creation, explicit yield, delay, suspend, resume, destroy, and context switching, then checks that the expected scheduler events are present in the trace buffer. It also intentionally overflows the ring buffer to validate wraparound behavior and overwrite accounting. A disabled-tracing path is included so the same application can verify that `CONFIG_DEBUG_TRACE=0` builds still compile and run correctly.
Add QEMU-side tooling to profile the tracing hot path without changing kernel behavior. This commit introduces a TCG plugin that instruments the `debug_trace_event()` function range and records instruction and store counts during execution. `trace_sched` is extended to print aggregate trace totals so the profiling output can be normalized to instructions-per-event and stores-per-event. The profiling support is intended for first-round cost characterization of the tracing path. It provides a reproducible emulator-based measurement method, but it does not claim hardware-cycle accuracy.
Document how to build and run the QEMU tracing profiler introduced for the scheduler tracing work. The guide explains the required environment variables, plugin build steps, symbol lookup for debug_trace_event(), and the QEMU command line used to reproduce profiling runs. It also describes how to combine the plugin output with trace_sched diagnostics to derive normalized per-event metrics.
8b71090 to
c9727c5
Compare
|
I reviewed the current scheduler tracing call sites more carefully and they are not protected uniformly.
|
Using mo_task_id() inside debug_trace_event() adds an extra out-of-line call in a scheduler hot path. Replace it with an inline current-task ID read to keep the trace path smaller while preserving task_id semantics. This change is intentionally narrow. It reduces the hot-path footprint.
|
After going through the changes, two things came up. First, the existing code routes through The other point is about the new |
Summary
As described in #25, Linmo currently lacks an event tracing system, which makes it difficult to debug asynchronous and preemptive kernel behavior.
This PR introduces the first scheduler-focused stage of the event tracing work, together with validation and profiling support for evaluating the implementation.
Functional scope
This PR includes:
debug_trace_event()Test Plan
make clean && make trace_schedmake clean && CFLAGS='-DCONFIG_DEBUG_TRACE=0' make trace_schedtrace_schedreportsOverall: PASStrace_schedreportsTracing disabled build: PASStrace_schedwraparound testmake -C tools/qemu QEMU_PLUGIN_INC=/path/to/qemu/includetools/qemu/README.mdEvaluation Setup / Reproduction
The profiling setup and reproduction steps included in this PR are documented in:
tools/qemu/README.mdPATH=/path/to/riscv/bin:$PATHCROSS_COMPILE=<your-riscv-tool-prefix>qemu-system-riscv32ccQEMU_PLUGIN_INC=/path/to/qemu/includepkg-configwithglib-2.0Performance Notes
This PR keeps tracing under compile-time control and is intended to compile out tracing-related scheduler work when tracing is disabled.
Enabled and disabled
trace_schedbuilds were revalidated locally after the latest changes.This PR also includes QEMU-based profiling support for first-round tracing cost characterization.
The profiling workflow is documented in
tools/qemu/README.md. Since the tracing path changed during thelatest review updates, the previously collected profiling numbers should be treated as provisional and refreshed in a follow-up measurement pass.
This PR does not claim that the final performance target in #25 has been met.
Notes
Future Work
Summary by cubic
Adds v1 of the kernel event tracing system focused on the scheduler to improve debugging of async/preemptive behavior. Includes a ring-buffer trace core, scheduler event hooks, a validation app, QEMU-based profiling for
debug_trace_event(), and a small hot-path optimization.New Features
debug_trace_*API; 16-byte event records (timestamp + task ID); lock-free writes, interrupt-safe reads; defaultsCONFIG_DEBUG_TRACE=1,DEBUG_EVENT_BUFFER_SIZE=256; no-op stubs when disabled.CREATE(id, prio),DESTROY(id, prev_state),SWITCH(prev_id, next_id),SUSPEND/RESUME,DELAY(ticks), and explicitYIELD(API only).trace_schedapp: deterministic checks for event presence/order, wraparound overwrite reporting, and monotonic timestamps; prints totals; dumps events on failure; added to app targets.debug_trace_event(); counts instructions and stores within a symbol PC range; includestools/qemuMakefile and-icountsteps.Refactors
debug_trace_event()(removesmo_task_id()call).Written for commit bc3c565. Summary will update on new commits.